Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP tblgen #19

Merged
merged 7 commits into from
Jan 5, 2024
Merged

WIP tblgen #19

merged 7 commits into from
Jan 5, 2024

Conversation

jumerckx
Copy link
Collaborator

Not at all finished, but I'm including it here if someone would like to provide feedback. Most notably, it's missing optional attributes. Also, attribute arguments to operation functions are simply expected to be of the right type, in contrast to the haskell code where different attribute types are declared.

example output for the LLVM dialect:
https://gist.github.com/jumerckx/bf892d88021aa5cf4ae037b98c21b0fa

  • a lot of operations aren't generated because of unsupported attributes, this is the same as for the haskell generator.
  • here's an example of where the optional attribute support is needed. I think it makes sense to support optional attributes as kwargs?

@vchuravy
Copy link
Collaborator

This looks great!

Should we use modules for names pacing instead of name prefixes?

@jumerckx
Copy link
Collaborator Author

Yes definitely, removing the prefix is already easily done using --strip-prefix. I've updated the gist with the output using this flag.
CamelCase function names will also need to be fixed.

@jumerckx
Copy link
Collaborator Author

jumerckx commented Sep 1, 2023

I added optional attributes by optionally pushing them in an attributes array. The gist with output for LLVMOps is updated.

function Load(location, type_0, addr_, alignment_, syncscope_)
  
  attributes = []
  (alignment_ != nothing) && push!(attributes, NamedAttribute("alignment", Attribute(alignment_)))
  (syncscope_ != nothing) && push!(attributes, NamedAttribute("syncscope", Attribute(syncscope_)))

  create_operation(
        "llvm.load", location, 
        results = [type_0], 
        operands = [addr_]
        owned_regions = [], 
        successors = [], 
        attributes = attributes,
        result_inference=false
      )
end

Still to-do is making the arguments for optional attributes kwargs with default value nothing.
e.g.

function Load(location, type_0, addr_; alignment_=nothing, syncscope_=nothing)
  ...

When this is done, the generator should hopefully be useful already.

A nice improvement would be to add types to the attribute arguments which can be done using something similar to this map used for the Haskell generator. Where each type of attribute gets associated with a Julia type that can be used to construct that attribute.

I haven't had a lot of time to work on this the past week, and the coming two weeks I'm on holiday but I'll pick up where I left off when I return.
Any feedback on what the generated code looks like, especially with regards to how I'm handling the optional attributes, is appreciated :)

@vchuravy
Copy link
Collaborator

vchuravy commented Sep 1, 2023

Yeah I am unsure if each new attribute should get its own type, but maybe there are classes of attributes we can group together?

Probably looking at what the Python MLIR tools do would make sense

@jumerckx
Copy link
Collaborator Author

Example output for an LLVMAtomicCmpXchg:

function AtomicCmpXchg(location::Location, type_0::MLIRType, ptr_::Value, cmp_::Value, val_::Value, success_ordering_::Union{NamedAttribute, Attribute}, failure_ordering_::Union{NamedAttribute, Attribute}, syncscope_=nothing::Union{Nothing, Union{NamedAttribute, String}}, alignment_=nothing::Union{Nothing, Union{NamedAttribute, Int64}}, weak_=nothing::Union{Nothing, Union{NamedAttribute, Attribute}}, volatile__=nothing::Union{Nothing, Union{NamedAttribute, Attribute}}, access_groups_=nothing::Union{Nothing, Union{NamedAttribute, Attribute}}, alias_scopes_=nothing::Union{Nothing, Union{NamedAttribute, Attribute}}, noalias_scopes_=nothing::Union{Nothing, Union{NamedAttribute, Attribute}}, tbaa_=nothing::Union{Nothing, Union{NamedAttribute, Attribute}})
  attributes = [make_named_attribute("success_ordering", success_ordering_), make_named_attribute("failure_ordering", failure_ordering_)]

  (syncscope_ != nothing) && push!(attributes, make_named_attribute("syncscope", syncscope_))
  (alignment_ != nothing) && push!(attributes, make_named_attribute("alignment", alignment_))
  (weak_ != nothing) && push!(attributes, make_named_attribute("weak", weak_))
  (volatile__ != nothing) && push!(attributes, make_named_attribute("volatile_", volatile__))
  (access_groups_ != nothing) && push!(attributes, make_named_attribute("access_groups", access_groups_))
  (alias_scopes_ != nothing) && push!(attributes, make_named_attribute("alias_scopes", alias_scopes_))
  (noalias_scopes_ != nothing) && push!(attributes, make_named_attribute("noalias_scopes", noalias_scopes_))
  (tbaa_ != nothing) && push!(attributes, make_named_attribute("tbaa", tbaa_))

  create_operation(
        "llvm.cmpxchg", location, 
        results = [type_0], 
        operands = [ptr_, cmp_, val_],
        owned_regions = [], 
        successors = [], 
        attributes = attributes,
        result_inference=false
      )
end
  • I added types to the generated function signatures.
    Optional attributes now have type Union{Nothing, Union{NamedAttribute, [...]}} (nested Union out of laziness, does it matter?). They get a default value of nothing

  • For some attribute types I added corresponding Julia types to be used to construct those attributes using Pangoraw's IR API. (lines 179-186)
    For the remaining attributes, the type can be a regular Attribute. This means some more operations are generated as the program doesn't chicken out when it encounters an attribute it doesn't know. It's left to the user to supply the right attribute as an argument in those cases...

  • To allow for different types arguments for attributes, I added make_named_attribute which is a no-op if a NamedAttribute is passed. (see here) This is somewhat similar to how Python handes things, there a runtime check is performed to see what exactly is passed as an argument (AFAICT).

The gist is updated, I also included the output from Python's tblgen generator.

I didn't yet look closely at how owned_regions and successors are handled as these don't occur very much but this seems like the logical next step.

As always, feedback is appreciated!

@jumerckx jumerckx changed the base branch from vc/tblgen to main September 29, 2023 19:14
@jumerckx
Copy link
Collaborator Author

jumerckx commented Sep 29, 2023

The c++ is not very pretty, but it now generates wrappers for all operations in a host of dialects I tried (Arith, Func, CF, GPU, LLVM, and PDL. All added to the gist).
Wrappers assume the llvm.jl-style context from #17.
If an attribute type isn't known the generated functions expect IR.Attribute.

Currently supported attributes (can be easily extended):

BoolAttr
DenseI32ArrayAttr
F32Attr
F64Attr
I32Attr
I64Attr
I64ArrayAttr
StrAttr

Some operations expect an attribute that has no CAPI function to create it, it seems that we're out of luck for those and need to implement CAPI functions ourselves.
e.g. https://github.com/llvm/llvm-project/blob/2e12fc3d04032be743b2aded354d81d53c5195ec/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td#L37C1-L47C69 used in GPU_ThreadIdOp.
edit: as a stopgap solution, these attributes can be created by parsing their textual representation e.g. parse(IR.Attribute, "#gpu<dim x>")

Should generated wrappers for in-tree dialects be added to the source?

@jumerckx jumerckx marked this pull request as ready for review October 23, 2023 19:37
@jumerckx
Copy link
Collaborator Author

Does jl-generators.cc need to be refactored before merging or is this OK?

Should this code be included as is or should I also generate and include the wrappers for different dialects and LLVM versions?
The dialect wrappers manually created by @Pangoraw are quite a lot nicer to use (e.g. arith.constant takes a Julia value, while Arith.constant takes an MLIR attribute and the value type needs to be specified as well.)

@jumerckx jumerckx marked this pull request as draft October 24, 2023 20:37
@jumerckx
Copy link
Collaborator Author

jumerckx commented Dec 30, 2023

I rewrote the generator from scratch here, with the generated output for some dialects here. Most of everything in one big function but the result is more than half as short and does a bit more. Some refactoring might be useful but I personally think the code is already a bit easier to follow than before.
Some improvements:

  • Operands are now positional arguments, all the other types of arguments are keyword arguments
  • Providing the location argument is now optional, it defaults to IR.Location()
  • If result type inference is supported by the op, the result argument is also optional in the Julia function. Not providing the result will create the operation with inference enabled.

I did a quick smoke test to see whether result inference and operand segment size attributes are correct, and they seem to be.
The code is not yet in this branch but rather on my fork, there I also added an external MLIR artifact to make installing a little easier even before the real Julia artifacts are available. (only a Linux artifact for now)
As it's probably not ideal to hack in an artifact the way I did, I would open a new pull request on top of main here if that's okay?
I think this code is in a way better position for merging anyway :)

Smoke test code sample
using MLIR
using MLIR: IR, API
ctx = IR.Context()
function registerAllDialects!()
    ctx = IR.context()
    registry = API.mlirDialectRegistryCreate()
    API.mlirRegisterAllDialects(registry)
    API.mlirContextAppendDialectRegistry(ctx, registry)
    API.mlirDialectRegistryDestroy(registry)

    API.mlirContextLoadAllAvailableDialects(ctx)
    return registry
end
registerAllDialects!();
API.mlirRegisterAllPasses()
API.mlirRegisterAllLLVMTranslations(ctx.context)

b0 = IR.Block()
cond = IR.push_argument!(b0, IR.MLIRType(Bool), IR.Location())

a = IR.get_result(push!(b0, MLIR.Dialects.Arith.constant(; value=IR.Attribute(42))))
b = IR.get_result(push!(b0, MLIR.Dialects.Arith.constant(; value=IR.Attribute(Int32(43)))))

b1 = IR.Block()
IR.push_argument!(b1, IR.MLIRType(Int), IR.Location())

b2 = IR.Block()
IR.push_argument!(b2, IR.MLIRType(Int), IR.Location())
IR.push_argument!(b2, IR.MLIRType(Int32), IR.Location())

push!(b0, MLIR.Dialects.ControlFlow.cond_br(cond, [a], [a, b]; trueDest=b1, falseDest=b2))

r = IR.Region()
push!.(Ref(r), [b0, b1, b2])

@show MLIR.Dialects.Builtin.module_(; bodyRegion=r)
# "builtin.module"() ({
# ^bb0(%arg0: i1):
#   %0 = "arith.constant"() <{value = 42 : i64}> : () -> i64
#   %1 = "arith.constant"() <{value = 43 : i32}> : () -> i32
#   "cf.cond_br"(%arg0, %0, %0, %1)[^bb1, ^bb2] <{operandSegmentSizes = array<i32: 1, 1, 2>}> : (i1, i64, i64, i32) -> ()
# ^bb1(%2: i64):  // pred: ^bb0
# ^bb2(%3: i64, %4: i32):  // pred: ^bb0
# }) : () -> ()

(pinging @Pangoraw RE: #21 (comment))

@mofeing
Copy link
Member

mofeing commented Dec 31, 2023

Ah great! I just started yesterday to rewrite it from scratch (https://github.com/mofeing/MLIR.jl/tree/pretty-tblgen) to get something like you just have done, but I'll stop if you already have it done.

Anyways, you may find relevant these commits for better docstring formatting.

jumerckx and others added 2 commits December 31, 2023 17:51
Co-authored-by: mofeing <sergio.sanchez.ramirez+git@bsc.es>
@jumerckx
Copy link
Collaborator Author

Ah great! I just started yesterday to rewrite it from scratch (mofeing/MLIR.jl@pretty-tblgen) to get something like you just have done, but I'll stop if you already have it done.

Oh, sorry for the duplicated work!

Anyways, you may find relevant these commits for better docstring formatting.

* [96c6712](https://github.com/JuliaLabs/MLIR.jl/commit/96c6712b9131bd5b1e6f2eb3a81721fab04c69ea) Specifically the following line for Markdown formatting of sections in docstrings. https://github.com/JuliaLabs/MLIR.jl/blob/96c6712b9131bd5b1e6f2eb3a81721fab04c69ea/deps/tblgen/jl-generators.cc#L892

* [3675fd2](https://github.com/JuliaLabs/MLIR.jl/commit/3675fd2633df834e0483f65f0828dd0894cac7ee) Trim whitespaces and newlines at the end of the description.

Thanks a lot, I added these changes and I reset this branch to keep this pr alive. If you have further comments or nits, don't hesitate to let me know!

This PR also includes the output for some dialects in the src/dialects directory but this is more so for easier reviewing. Dialects such as linalg and builtin require multiple .td files (see linalg_unstructured.jl and linalg_structured.jl). I don't know whether it's worth it to fully automate merging of multiple outputs in a single Julia module?

I also just noticed the module name for the memref dialect is incorrectly Arith, will try to fix this.

@jumerckx jumerckx marked this pull request as ready for review December 31, 2023 17:01
@codecov-commenter
Copy link

codecov-commenter commented Dec 31, 2023

Codecov Report

Attention: 2929 lines in your changes are missing coverage. Please review.

Comparison is base (d239921) 41.04% compared to head (ba98a93) 7.08%.
Report is 3 commits behind head on main.

Files Patch % Lines
src/dialects/linalg_structured.jl 0.00% 706 Missing ⚠️
src/dialects/llvm.jl 0.00% 617 Missing ⚠️
src/dialects/arith.jl 0.00% 373 Missing ⚠️
src/dialects/shape.jl 0.00% 311 Missing ⚠️
src/dialects/memref.jl 0.00% 247 Missing ⚠️
src/dialects/complex.jl 0.00% 221 Missing ⚠️
src/dialects/pdl.jl 0.00% 120 Missing ⚠️
src/dialects/amdgpu.jl 0.00% 109 Missing ⚠️
src/dialects/affine.jl 0.00% 91 Missing ⚠️
src/dialects/func.jl 0.00% 38 Missing ⚠️
... and 5 more

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #19       +/-   ##
==========================================
- Coverage   41.04%   7.08%   -33.96%     
==========================================
  Files           6      20       +14     
  Lines         614    3542     +2928     
==========================================
- Hits          252     251        -1     
- Misses        362    3291     +2929     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mofeing
Copy link
Member

mofeing commented Dec 31, 2023

Oh, sorry for the duplicated work!

Nah, don't worry. Happy this advances!

This PR also includes the output for some dialects in the src/dialects directory but this is more so for easier reviewing. Dialects such as linalg and builtin require multiple .td files (see linalg_unstructured.jl and linalg_structured.jl). I don't know whether it's worth it to fully automate merging of multiple outputs in a single Julia module?

The problem with this is that different LLVM versions have different dialects available, so depending on the Julia version, some dialects might error?

(In reality, it shouldn't error because create_operation let's you type any operation from any dialect even if it's not registered... but I don't know how the other parts will work).

One suggestion: Wouldn't it be more practical to keep the original name of the dialects? E.g. affine instead of Affine

@jumerckx
Copy link
Collaborator Author

jumerckx commented Dec 31, 2023

The problem with this is that different LLVM versions have different dialects available, so depending on the Julia version, some dialects might error?

I see, I haven't really been considering older MLIR versions as everything evolves so quickly. E.g. the generated output for affine.loop isn't quite right because the tablegen file didn't fully specify the arguments. This was fixed only 3 months ago. https://github.com/llvm/llvm-project/blame/949ec83eaf6fa6dbffb94c2ea9c0a4d5efdbd239/mlir/include/mlir/Dialect/Affine/IR/AffineOps.td#L230-L235
I'm not very familiar with binarybuilder but I imagine in the future it might be possible to run this script in the artifact generation step, on all relevant dialects, for each MLIR version such that each version only has dialects available that are actually supported?

One suggestion: Wouldn't it be more practical to keep the original name of the dialects? E.g. affine instead of Affine

I opted for a capital first letter to align closer the naming convention for Julia modules https://docs.julialang.org/en/v1/manual/style-guide/#Use-naming-conventions-consistent-with-Julia-base/. But I see the advantage of staying with MLIR's naming convention here as well... Either way, I don't have any strong preference.

edit: the dialect's getName method doesn't return the short mnemonic string, for affine it returns "Affine_Dialect" for example. Not sure where to get the short lowercase name from.

@mofeing
Copy link
Member

mofeing commented Jan 1, 2024

I'm not very familiar with binarybuilder but I imagine in the future it might be possible to run this script in the artifact generation step, on all relevant dialects, for each MLIR version such that each version only has dialects available that are actually supported?

That's not how BinaryBuilder or Yggdrasil works: it only provides the external binaries. The bindings are delegated to other packages (like this one).

In my opinion, the generator script should be run during the build step (i.e. in deps/build.jl) and generate the appropriate dialects for the corresponding LLVM version.

I opted for a capital first letter to align closer the naming convention for Julia modules https://docs.julialang.org/en/v1/manual/style-guide/#Use-naming-conventions-consistent-with-Julia-base/. But I see the advantage of staying with MLIR's naming convention here as well... Either way, I don't have any strong preference.

This can be easily be solved with const aliases.

@mofeing
Copy link
Member

mofeing commented Jan 1, 2024

edit: the dialect's getName method doesn't return the short mnemonic string, for affine it returns "Affine_Dialect" for example. Not sure where to get the short lowercase name from.

The original Haskell bindings use getDialectName instead of getName. Check out https://github.com/google/mlir-hs/blob/1cc6c641fe81446f16eec052104feac5de217569/tblgen/hs-generators.cc#L269-L284

@jumerckx
Copy link
Collaborator Author

jumerckx commented Jan 1, 2024

The original Haskell bindings use getDialectName instead of getName. Check out google/mlir-hs@1cc6c64/tblgen/hs-generators.cc#L269-L284

Ok, module naming is updated, thanks for the pointer.

In my opinion, the generator script should be run during the build step (i.e. in deps/build.jl) and generate the appropriate dialects for the corresponding LLVM version.

I didn't realise this was something that was possible but sounds like a much better approach :)

@Pangoraw
Copy link
Member

Pangoraw commented Jan 2, 2024

The dialect wrappers manually created by @Pangoraw are quite a lot nicer to use (e.g. arith.constant takes a Julia value, while Arith.constant takes an MLIR attribute and the value type needs to be specified as well.)

Can we automatically call IR.Attribute for attributes which are not of type Attribute?
This way, with result inference one can just type:

arith.constant(value=1f0) # %cst = arith.constant 1.000000e+00 : f32
arith.constant(value=12) # %c12_i64 = arith.constant 12 : i64

One drawback is that it can be a bit of a confusing API.

@mofeing
Copy link
Member

mofeing commented Jan 3, 2024

@jumerckx I've opened jumerckx#1 in your fork that runs the binding generator on build.

@mofeing
Copy link
Member

mofeing commented Jan 3, 2024

arith.constant(value=1f0) # %cst = arith.constant 1.000000e+00 : f32
arith.constant(value=12) # %c12_i64 = arith.constant 12 : i64

MLIR doesn't name operands, it can be confusing. It should be something like:

arith.constant(1f0) # %cst = arith.constant 1.000000e+00 : f32
arith.constant(12) # %c12_i64 = arith.constant 12 : i64

@Pangoraw
Copy link
Member

Pangoraw commented Jan 3, 2024

MLIR doesn't name operands, it can be confusing. It should be something like:

value is not an operand in this case, but an attribute (https://mlir.llvm.org/docs/Dialects/ArithOps/#attributes-3). IIUC, the wrapper use positional arguments for operands.

Co-authored-by: jumerckx <31353884+jumerckx@users.noreply.github.com>
@mofeing
Copy link
Member

mofeing commented Jan 5, 2024

Now that my PR has been merged, dialect bindings are generated on the fly!

There are some some things left to do:

  •  Select the LLVM_full_jl version corresponding to the running Julia version
    • Can we run Pkg.install inside build.jl for the deps environment?
    • Currently fixed to v15
  • Select the target dialects corresponding to the LLVM_full_jl version
  • Fix the dialects that generate crashing Julia code (currently commented out)
  • Generate other constructs like enumerations (e.g. arithmetic.Predicates).

Maybe we could merge this PR and open PRs for these tasks.

@jumerckx
Copy link
Collaborator Author

jumerckx commented Jan 5, 2024

  • Fix the dialects that generate crashing Julia code (currently commented out)

I could look into this today already, haven't yet checked what's actually going wrong.
edit: it's just an incomplete list of reserved keywords in the generator, I'll push a fix in a separate pr once this is merged.

@jumerckx
Copy link
Collaborator Author

jumerckx commented Jan 5, 2024

So is it ok if I merge this? Anyone wanna give me the go-ahead? :)

@Pangoraw
Copy link
Member

Pangoraw commented Jan 5, 2024

A few more things that come to my mind:

  • Update toy brutus.
  • Run the build step in the CI (or is it run automatically?).

@mofeing
Copy link
Member

mofeing commented Jan 5, 2024

  • Run the build step in the CI (or is it run automatically?).

The build step is run automatically. See https://pkgdocs.julialang.org/v1/creating-packages/#Adding-a-build-step-to-the-package

CI is currently failing on Julia 1.9 because Julia 1.9 uses LLVM 14 and I've fixed LLVM_full_jl to v15.

Fixing the Select the LLVM_full_jl version corresponding to the running Julia version check would solve it.

@mofeing
Copy link
Member

mofeing commented Jan 5, 2024

Ahh I think selecting the LLVM_full_jll version based on Julia version is easier than I thought. Will open a PR when this is merged.

@jumerckx jumerckx merged commit baced91 into JuliaLLVM:main Jan 5, 2024
0 of 3 checks passed
@jumerckx jumerckx deleted the vc/tblgen branch January 5, 2024 16:26
This was referenced Jan 5, 2024
@vchuravy
Copy link
Collaborator

We should not be depending on LLVM_full_jll, rather I would do something similar to what we do in https://github.com/JuliaLabs/MLIR.jl/blob/26157615dda7e09bc2e3bd33cd81aeb715453a9e/res/wrap.jl#L21 and pre-generate and version the Julia files (and avoid any build steps)

Then MLIR.jl Oly has to check what version of LLVM Julia is using

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants